Syntax and Structure in Statistical Translation
Authors
Abstract
In this paper, we describe a source-side reordering method based on syntactic chunks for phrase-based statistical machine translation. First, we shallow-parse the source-language sentences. Then, reordering rules are learned automatically from source-side chunks and word alignments. During translation, the rules are used to generate a reordering lattice for each sentence. Experimental results on a Chinese-to-English task show an absolute BLEU improvement of 0.5% to 1.8% on various test sets and better computational efficiency than reordering during decoding. The experiments also show that reordering at the chunk level performs better than reordering at the POS level.
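The rule-learning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(tag, start, end)` chunk representation, and the choice to order chunks by the mean target position of their aligned words are all assumptions made for illustration. The sketch also learns rules over whole-sentence chunk tag sequences and returns a flat list of candidate orders, whereas a real system would match sub-sequences and pack the alternatives into a lattice.

```python
from collections import Counter, defaultdict


def chunk_permutation(chunks, alignment):
    """Order source chunks by where their words land on the target side.

    chunks: list of (tag, start, end) over source word indices, end exclusive.
    alignment: iterable of (source_index, target_index) pairs.
    Returns a tuple of chunk indices sorted by mean aligned target position,
    or None if any chunk has no aligned word (hypothetical policy: skip it).
    """
    targets = defaultdict(list)
    for src, tgt in alignment:
        targets[src].append(tgt)

    keys = []
    for i, (_, start, end) in enumerate(chunks):
        positions = [t for s in range(start, end) for t in targets.get(s, [])]
        if not positions:
            return None
        keys.append((sum(positions) / len(positions), i))
    return tuple(i for _, i in sorted(keys))


def learn_rules(corpus):
    """Count, per chunk tag sequence, how often each chunk permutation occurs.

    corpus: iterable of (chunks, alignment) pairs.
    Returns a dict mapping tag sequence -> Counter of permutations.
    """
    rules = defaultdict(Counter)
    for chunks, alignment in corpus:
        perm = chunk_permutation(chunks, alignment)
        if perm is not None:
            tags = tuple(tag for tag, _, _ in chunks)
            rules[tags][perm] += 1
    return rules


def candidate_orders(tags, rules, min_count=1):
    """Return the monotone order plus learned permutations, most frequent first."""
    monotone = tuple(range(len(tags)))
    orders = [monotone]
    for perm, count in rules.get(tuple(tags), Counter()).most_common():
        if count >= min_count and perm != monotone:
            orders.append(perm)
    return orders


# Toy example: three source chunks whose second and third chunks swap on the
# target side according to the word alignment.
chunks = [("NP", 0, 2), ("PP", 2, 4), ("VP", 4, 5)]
alignment = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 2)]
rules = learn_rules([(chunks, alignment)])
orders = candidate_orders(["NP", "PP", "VP"], rules)
```

Keeping the monotone order as the first candidate mirrors the idea that the lattice always contains the original word order, so the decoder can still fall back to it when a learned rule does not help.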
Related resources
Statistical Translation Model Based On Source Syntax Structure
Syntax-based statistical translation models have been shown to outperform phrase-based models, especially for language pairs with very different syntactic structures, such as Chinese and English. In this talk I will introduce a series of statistical translation models based on source syntax structure. The tree-based model uses the one-best syntax tree for translation. The forest-based model uses a comp...
A new model for persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...
Syntax-Directed Phrase-based Statistical Machine Translation
The SDTS has been applied in the field of compilers and in transfer-based machine translation. After the parsing step, the syntactic structure of a sentence is identified. The parse tree is then analyzed, augmented, and transformed by later phases in the SMT system. Those phases are controlled by syntax. We use the stochastic SDTS to model this kind of translation process for phrase-based SMT. T...
A Detailed Analysis of Phrase-based and Syntax-based Machine Translation: The Search for Systematic Differences
This paper describes a range of automatic and manual comparisons of phrase-based and syntax-based statistical machine translation methods applied to English-German and English-French translation of user-generated content. The syntax-based methods underperform the phrase-based models and the relaxation of syntactic constraints to broaden translation rule coverage means that these models do not n...
Using Grammatical Roles to Improve Statistical Machine Translation
Statistical machine translation systems often struggle to preserve predicate-argument structure. We present a new hierarchical machine translation model that explicitly captures the grammatical roles taken on by the words and phrases being translated (e.g., subject, object, and indirect object). Although existing hierarchical and syntax-based grammars can capture how many arguments a predicate t...
Post-ordering by Parsing for Japanese-English Statistical Machine Translation
Reordering is a difficult task in translating between widely different languages such as Japanese and English. We employ the post-ordering framework proposed by Sudoh et al. (2011b) for Japanese-to-English translation and improve upon the reordering method. The existing post-ordering method reorders a sequence of target-language words in source-language word order via SMT, while our method re...
Publication date: 2007